Abstract

We consider the typical distance between vertices of the giant component of a random 
intersection graph having a power law (asymptotic) vertex degree distribution with infinite 
second moment. Given two vertices from the giant component, we construct an $O_P(\log\log n)$
upper bound for the length of the shortest path connecting them.
1 Introduction

Given a collection of subsets $S(1), \dots, S(n)$ of the set $W = \{w_1, \dots, w_m\}$, define the intersection
graph on the vertex set $V = \{v_1, \dots, v_n\}$ such that $v_i$ and $v_j$ are joined by an edge (denoted
$v_i \sim v_j$) whenever $S(i) \cap S(j) \neq \emptyset$, for $i \neq j$. Assuming that the sets $S(i)$, $i = 1, \dots, n$, are
drawn at random, we obtain a random intersection graph.
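
The definition above can be sketched in a few lines of Python. This is only an illustration: the function name `intersection_graph` and the representation of vertices by list indices are assumptions made here, not notation from the paper.

```python
from itertools import combinations

def intersection_graph(sets):
    """Adjacency sets of the intersection graph of `sets` (hypothetical helper).

    Vertex i stands for sets[i]; i and j are adjacent iff the two sets
    share at least one element.
    """
    adj = {i: set() for i in range(len(sets))}
    for i, j in combinations(range(len(sets)), 2):
        if sets[i] & sets[j]:          # S(i) and S(j) intersect
            adj[i].add(j)
            adj[j].add(i)
    return adj

# Small example: S(0) and S(1) share the element 2, the other sets are isolated.
example = intersection_graph([{1, 2}, {2, 3}, {4}, set()])
```

The quadratic loop over pairs is chosen for clarity; for large $n$ one would index vertices by the elements of $W$ instead.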

Random intersection graphs have applications in various fields: design and analysis of secure 
wireless sensor networks [TJ, [5], modelling of social networks [6], statistical classification [8]; see
also [12], [13]. Usually, in applications the number of interacting nodes (vertices) is large and it
is convenient to study the statistical properties of parameters of interest. 

We consider a class of random intersection graphs, where $m$ is much larger than $n$ and where
the random subsets $S(i)$, $i = 1, \dots, n$, are independent. Moreover, we assume that for every $i$,
the distribution of $S(i)$ is a mixture of uniform distributions. That is, for every $k$, conditionally
on the event $|S(i)| = k$ the random set $S(i)$ is uniformly distributed in the class of all subsets
of $W$ of size $k$. In particular, with $P_{*i}$ denoting the distribution of $|S(i)|$ we have, for every
$A \subset W$, $P(S(i) = A) = \binom{m}{|A|}^{-1} P_{*i}(|A|)$. The random intersection graph corresponding to the
sequence of distributions $P_* = (P_{*1}, \dots, P_{*n})$ is denoted $G(n, m, P_*)$.
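
A minimal sketch of the "mixture of uniform distributions" construction: first draw the size, then draw a uniform subset of that size. The helper names `sample_set` and `prob_of_set`, and the encoding of the size distribution as a dictionary, are assumptions made for this illustration.

```python
import random
from math import comb

def sample_set(m, size_pmf, rng):
    """Draw S: first |S| = k from size_pmf (dict k -> probability),
    then S uniformly among the k-subsets of {0, ..., m-1}."""
    sizes = list(size_pmf)
    k = rng.choices(sizes, weights=[size_pmf[s] for s in sizes])[0]
    return set(rng.sample(range(m), k))

def prob_of_set(m, size_pmf, A):
    """P(S = A) = binom(m, |A|)^{-1} * P(|S| = |A|)."""
    return size_pmf.get(len(A), 0.0) / comb(m, len(A))

rng = random.Random(0)
s = sample_set(10, {2: 0.5, 3: 0.5}, rng)
```

With `m = 4` and all mass on size 2, every 2-subset has probability `1 / 6`, in agreement with the displayed formula.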

Assuming that as $n, m \to \infty$ the asymptotic distributions of $\sqrt{n/m}\,|S(i)|$ have power tails and
infinite second moment, we obtain the random intersection graph $G(n, m, P_*)$ with asymptotically
heavy tailed vertex degree distribution without second moment, see [6] and pQ.
It is known that in some random graph models with a heavy tailed vertex degree distribution
the typical distance between vertices of the giant component is of order $O_P(\log\log n)$, see [1],
[10], [14], [15], [16]. In the present note we extend this bound to the random intersection graph
model with heavy tailed vertex degree distribution without second moment.
The paper is organized as follows: results are stated in Section 2. Proofs are given in Section 3. 
2 Results



Given an integer sequence $\{m_1, m_2, \dots\}$, let $\{(\tilde Z_{n1}, \dots, \tilde Z_{nn}), n = 1, 2, \dots\}$ be a sequence of random
vectors with independent coordinates such that for every $n$, $\tilde Z_{ni}$ takes values in $\{0, 1, \dots, m_n\}$,
$1 \le i \le n$. Let $P_{ni}$ denote the distribution of $\tilde Z_{ni}$. Write $P_n = (P_{n1}, \dots, P_{nn})$. Fix two
countable sets $\{v_1, v_2, \dots\}$ and $\{w_1, w_2, \dots\}$ and define the sequence of random intersection
graphs $\{G_n = G(n, m_n, P_n), n = 1, 2, \dots\}$ as follows. Given $n$, let $S_n(v_1), \dots, S_n(v_n)$ be independent
subsets of $W_n = \{w_1, \dots, w_{m_n}\}$ of sizes $\tilde Z_{ni} = \tilde Z_n(v_i) := |S_n(v_i)|$, $1 \le i \le n$,
such that $P(S_n(v_i) = A) = \binom{m_n}{|A|}^{-1} P_{ni}(|A|)$, for $A \subset W_n$. $G_n$ is the graph on the vertex set
$V_n = \{v_1, \dots, v_n\}$, where $v_i$ and $v_j$ are adjacent whenever $S_n(v_i) \cap S_n(v_j) \neq \emptyset$. Let $\bar P_{ni}$ denote
the distribution of the normalized random variable $Z_{ni} = Z_n(v_i) := |S_n(v_i)| \sqrt{n/m_n}$.
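
To make the model concrete, here is a hedged sketch that samples the sets $S_n(v_i)$ with heavy tailed sizes. The specific choice of a Pareto law $P(Z > t) = (c_0/t)^{1+\alpha}$, $t \ge c_0$, for the normalized sizes is an assumption made for this illustration; the paper only requires two-sided power tail bounds. The function name `sample_power_law_sets` is hypothetical.

```python
import math
import random

def sample_power_law_sets(n, m, alpha, c0=1.0, seed=0):
    """Sample n independent subsets of {0, ..., m-1}.

    Assumption of this sketch: the normalized size Z is Pareto with
    P(Z > t) = (c0 / t)^(1 + alpha) for t >= c0, and the set size is
    floor(Z * sqrt(m / n)), capped at m.
    """
    rng = random.Random(seed)
    scale = math.sqrt(m / n)
    sets = []
    for _ in range(n):
        u = 1.0 - rng.random()                      # uniform on (0, 1]
        z = c0 * u ** (-1.0 / (1.0 + alpha))        # inverse-CDF sampling
        k = min(m, int(z * scale))                  # integer set size
        sets.append(frozenset(rng.sample(range(m), k)))
    return sets

sets = sample_power_law_sets(n=50, m=5000, alpha=0.5)
```

Because $0 < \alpha < 1$, the sampled sizes have finite mean but infinite second moment, which is the regime studied here.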

Let $d_n(u, v)$ denote the distance between vertices $u, v \in V_n$ in $G_n$ (= number of edges in the
shortest path of $G_n$ connecting $u$ and $v$). Let $C_1 = C_1(G_n) \subset V_n$ denote the vertex set of the
largest connected component of $G_n$. Therefore, the subgraph of $G_n$ induced by $C_1$ is connected
and the number of vertices of any other connected subgraph of $G_n$ is not greater than $|C_1|$. A
vertex $u \in V_n$ is called maximal in $G_n$ if $Z_n(u) = \max_{v \in V_n} Z_n(v)$.
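
The three notions just introduced (graph distance, largest component, maximal vertex) can be computed as follows. This is a sketch on a plain adjacency-dictionary representation; the function names are chosen here for illustration only.

```python
from collections import deque

def bfs_distances(adj, source):
    """Graph distances from `source` to every vertex reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def largest_component(adj):
    """Vertex set of a largest connected component (ties: first found)."""
    seen, best = set(), set()
    for v in adj:
        if v not in seen:
            comp = set(bfs_distances(adj, v))
            seen |= comp
            if len(comp) > len(best):
                best = comp
    return best

def maximal_vertices(sizes):
    """Indices i whose size Z(i) equals the maximum of all sizes."""
    top = max(sizes)
    return [i for i, z in enumerate(sizes) if z == top]

adj = {0: {1}, 1: {0, 2}, 2: {1}, 3: {4}, 4: {3}}
```

A vertex missing from the dictionary returned by `bfs_distances` is at infinite distance from the source, which corresponds to the event $d(u, v) = \infty$.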

Theorem 1. Let $0 < \alpha < 1$ and $c_0, c_1, c_2 > 0$. Let $\{\omega_1, \omega_2, \dots\}$ be a sequence of positive
numbers satisfying $\lim_n \omega_n = +\infty$. Let $\{G(n, m_n, P_n), n = 1, 2, \dots\}$ be a sequence of random
intersection graphs such that

(i) $n \ln^2 n = o(m_n)$ as $n \to \infty$;

(ii) $\exists\, n_0$ such that $\forall\, n > n_0$ we have

$$c_1 t^{-1-\alpha} \le P(Z_{ni} > t) \le c_2 t^{-1-\alpha}, \qquad \forall t \in [c_0, n^{1/(1+\alpha)} \omega_n], \quad \forall i \in \{1, \dots, n\}. \tag{1}$$

Let $\{u_n\}$ be a sequence of maximal vertices, i.e., for every $n$, the vertex $u_n \in V_n$ is maximal in
$G_n$. For every $\varepsilon > 0$ we have as $n \to \infty$

$$P\big(d(v_1, u_n) \le (1 + \varepsilon) \ln^{-1}(1/\alpha) \ln(\ln(2 + n)) \,\big|\, d(v_1, u_n) < \infty\big) \to 1, \tag{2}$$
$$P\big(d(v_1, v_2) \le (2 + \varepsilon) \ln^{-1}(1/\alpha) \ln(\ln(2 + n)) \,\big|\, v_1, v_2 \in C_1\big) \to 1. \tag{3}$$
Here 'ln' denotes the natural logarithm.

It follows from (3), by the symmetry, that given two vertices $v, v'$ drawn uniformly at random
from the giant component $C_1$ we have $d(v, v') = O_P(\ln\ln n)$. Recall that such a distance is of
much larger order $O_P(\ln n)$ in the corresponding Erdős-Rényi graph ($G(n, p)$ with $1 < c_1 <
np < c_2$). This remarkable difference is explained by an effect of very large nodes whose degrees
realize the extremes from a power law distribution, see [13], [15].
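
To get a feeling for the gap between the two orders, the right-hand side of (3) can be evaluated numerically and compared with $\ln n$. The snippet below is plain arithmetic; the function name `loglog_bound` and the sample values of $n$, $\alpha$ and $\varepsilon$ are chosen here for illustration.

```python
import math

def loglog_bound(n, alpha, eps):
    """Right-hand side of (3): (2 + eps) * ln(ln(2 + n)) / ln(1 / alpha)."""
    return (2.0 + eps) * math.log(math.log(2 + n)) / math.log(1.0 / alpha)

# Compare the loglog bound with ln n for a few graph sizes.
rows = [(n, loglog_bound(n, alpha=0.5, eps=0.1), math.log(n))
        for n in (10**6, 10**9, 10**12)]
```

Each row holds $n$, the bound from (3), and $\ln n$; the bound grows far more slowly than $\ln n$ as $n$ increases.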

Note that with probability tending to 1 (with high probability) every maximal vertex belongs
to the giant component $C_1$. In addition, as $n \to \infty$ we have $|C_1| > \rho n$, for some $\rho \in (0, 1)$. We
collect these statements in Remark 1.

Remark 1. Assume that conditions of Theorem 1 are satisfied. Then

$$\exists\, \rho \in (0, 1) \text{ such that } P(|C_1| > \rho n) \to 1 \text{ as } n \to \infty. \tag{4}$$

Let $\{u_n\}$ be a sequence of maximal vertices, i.e., for every $n$, the (random) vertex $u_n \in V_n$ is
maximal in $G_n$. Then

$$P(u_n \in C_1(G_n)) \to 1 \text{ as } n \to \infty. \tag{5}$$
Acknowledgement. I would like to thank Ilkka Norros for valuable discussion. 
3 Proofs



We start with the auxiliary Lemmas 1 to 4. Then we prove Remark 1, see Lemma 5 below, and
Theorem 1.

In what follows we write $l_2(n) := \ln(\ln(n))$, where $\ln$ denotes the natural logarithm. $H_{j,k,m}$
denotes the hypergeometric random variable with parameters $j, k \le m$ and the distribution

$$P(H_{j,k,m} = r) = \binom{j}{r} \binom{m-j}{k-r} \binom{m}{k}^{-1}.$$

Lemma 1. Let $S_1, S_2$ be independent random subsets of the set $W = \{1, \dots, m\}$ such that $S_1$
(respectively $S_2$) is uniformly distributed in the class of subsets of $W$ of size $j$ (respectively $k$).
Then $H = |S_1 \cap S_2|$ is the hypergeometric random variable with parameters $j, k, m$ and mean
$EH = jk/m$. The probability $p' := P(H = 0) = (m - k)_j/(m)_j$ satisfies, for $j + k \le m$,

$$1 - \frac{jk/m}{1 - (j + k)/m} \le p' \le 1 - \frac{jk}{m} + \Big(\frac{jk}{m}\Big)^2. \tag{6}$$

Here we denote $(m)_j = m(m - 1) \cdots (m - j + 1)$. For $0 < s < 1$ and $j + k \le sm$ we have

$$\frac{jk}{m} + \frac{2}{1 - s} \frac{j^2 k^2}{m^2} \ge P(S_1 \cap S_2 \neq \emptyset) \ge \frac{jk}{m} - \Big(\frac{jk}{m}\Big)^2. \tag{7}$$

For $\lambda = EH$ and $t > 0$ we have

$$P(H \ge \lambda + t) \le \exp\Big\{-\frac{t^2}{2(\lambda + t/3)}\Big\}, \qquad P(H \le \lambda - t) \le \exp\Big\{-\frac{t^2}{2\lambda}\Big\}. \tag{8}$$
In particular, we have

$$P(H = 0) \le e^{-jk/(2m)}. \tag{9}$$
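
The quantities in Lemma 1 are easy to check numerically. The sketch below computes $p' = P(H = 0) = (m-k)_j/(m)_j$ exactly as a ratio of binomial coefficients and compares it with the two-sided bound $1 - \frac{jk/m}{1-(j+k)/m} \le p' \le 1 - \frac{jk}{m} + (\frac{jk}{m})^2$ and with the exponential bound $e^{-jk/(2m)}$. The function names and the test parameters are chosen here for illustration.

```python
from math import comb, exp

def prob_disjoint(j, k, m):
    """p' = P(H = 0) = (m-k)_j / (m)_j = C(m-k, j) / C(m, j)."""
    return comb(m - k, j) / comb(m, j)

def lemma1_bounds(j, k, m):
    """Return (lower, p', upper, exponential bound) for given j, k, m."""
    lam = j * k / m                                   # mean of H
    lower = 1.0 - lam / (1.0 - (j + k) / m)
    upper = 1.0 - lam + lam ** 2
    return lower, prob_disjoint(j, k, m), upper, exp(-lam / 2.0)

checks = [lemma1_bounds(5, 7, 1000), lemma1_bounds(30, 40, 2000)]
```

For both parameter choices the exact value of $p'$ lies between the lower and upper bounds and below the exponential bound.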

Proof of Lemma 1. Inequalities (6) are shown in [12]. Inequalities (7) are simple consequences
of (6). We only show the left-hand side inequality for $j, k \ge 1$. In this case $j + k \le 2jk$ and we
have

$$\frac{1}{1 - (j + k)/m} = 1 + \frac{(j + k)/m}{1 - (j + k)/m} \le 1 + \frac{2jk}{m} \frac{1}{1 - s}.$$

Now, the desired inequality follows from the left-hand side inequality of (6).

Exponential inequalities for hypergeometric probabilities (8) can be derived from the corresponding
inequalities for binomial probabilities, see [9]. Their proof can be found in, e.g., [11].
The right-hand side inequality of (8) applied to $t = EH$ gives (9). □

Lemma 2. Given integer $m$ and constants $0 < \gamma_1 < \gamma_2 < 1$ let $z_1, z_2, \dots, z_r$ be integers such
that $z = \sum_{h=1}^{r} z_h \le \gamma_1 m$ and $z_h \ge 6\gamma_2 (\gamma_2 - \gamma_1)^{-2} \ln n \ge 1$, for $1 \le h \le r$. Let $S_1, S_2, \dots, S_r$
be independent random subsets of $W = \{1, \dots, m\}$ such that, for every $h$, $S_h$ is uniformly
distributed in the class of subsets of $W$ of size $z_h$. Then

$$P\Big(\big|\cup_{i=1}^{r} S_i\big| \ge (1 - \gamma_2) \sum_{i=1}^{r} z_i\Big) \ge 1 - r n^{-3}. \tag{10}$$

Proof of Lemma 2. Write $D_{[0]} = \emptyset$ and, for $h \ge 1$, denote $D_{[h]} = \cup_{k \le h} S_k$ and $S'_h = S_h \setminus D_{[h-1]}$.
Note that $|D_{[r]}| = \sum_{h=1}^{r} |S'_h| \le z$. In order to prove (10) we show that uniformly in $h$ and $D_{[h-1]}$
(satisfying $|D_{[h-1]}| \le \gamma_1 m$) we have $p_h := P(|S'_h| < (1 - \gamma_2) z_h \mid D_{[h-1]}) \le n^{-3}$. It is convenient
to write this probability in the form $p_h = P(H > \gamma_2 a)$, where $H$ denotes the hypergeometric
random variable with parameters $a = z_h$, $b = |D_{[h-1]}|$ and $m$. We have $EH = ab/m \le \gamma_1 a$. An
application of (8) shows $p_h \le \exp\{-a(\gamma_2 - \gamma_1)^2/(2\gamma_2)\}$. For $a = z_h \ge (6\gamma_2/(\gamma_2 - \gamma_1)^2) \ln n$ we
obtain $p_h \le n^{-3}$, thus completing the proof. □






Lemma 3. Given integers $1 \le a, b, d \le m$, let $S_a \subset S_d$ be subsets of the set $W = \{1, 2, \dots, m\}$
of sizes $|S_a| = a$ and $|S_d| = d$. Here $a \le d$. Let $S_b$ be a random subset of $W$ uniformly distributed
over the subsets of $W$ of size $b$. For integers $0 < s \le r \le t$ satisfying $s \le a \wedge b$, we have

$$P\big(|S_b \cap S_d| \ge t \,\big|\, |S_b \cap S_a| \ge s\big) \le \max_{s \le i \le r} P(H_{b-i,d-a,m-a} \ge t - i) + \frac{P(H_{a,b,m} > r)}{P(H_{a,b,m} \ge s)}. \tag{11}$$

Assume that $d \le m/100$. Then we have

$$P\big(|S_b \cap S_d| \ge b/2 \,\big|\, S_b \cap S_a \neq \emptyset\big) \le e^{-b/8} \Big(1 + 4 \frac{m}{ab} I_{\{a \ge b/4,\ ab \le m,\ b \ge 3\}}\Big). \tag{12}$$

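Proof of Lemma 3. Let us prove (11). Introduce the events $\mathcal{B} = \{|S_b \cap S_d| \ge t\}$, $\mathcal{A} = \{|S_b \cap S_a| \ge s\}$
and write $p := P(\mathcal{B} \mid \mathcal{A})$. Denote $p_i = P(H_{b-i,d-a,m-a} \ge t - i)$. Let $\sum_{A_j}$ denote the sum over
subsets $A_j \subset S_a$ of size $|A_j| = j$. We have

$$\begin{aligned}
P(\mathcal{B} \cap \mathcal{A}) &= \sum_{s \le j \le a \wedge b} \sum_{A_j} P(\mathcal{B} \cap \{S_b \cap S_a = A_j\}) \\
&= \sum_{s \le j \le a \wedge b} \sum_{A_j} P(\mathcal{B} \mid S_b \cap S_a = A_j)\, P(S_b \cap S_a = A_j) \\
&= \sum_{s \le j \le a \wedge b} \sum_{A_j} p_j\, P(S_b \cap S_a = A_j) = \sum_{s \le j \le a \wedge b} p_j\, P(H_{a,b,m} = j) \\
&\le \max_{s \le i \le r} p_i\, P(s \le H_{a,b,m} \le r) + P(H_{a,b,m} > r).
\end{aligned} \tag{13}$$

(11) follows from (13) and the identity $p = P(\mathcal{B} \cap \mathcal{A})/P(\mathcal{A})$.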

Let us prove (12). Put $t = \lceil b/2 \rceil$, $s = 1$ and $r = \lfloor b/4 \rfloor$ and apply (11). We obtain

$$P\big(|S_b \cap S_d| \ge b/2 \,\big|\, S_b \cap S_a \neq \emptyset\big) \le \max_{1 \le i \le r} p_i + p_1^*/p_2^*. \tag{14}$$

Here we denote $p_1^* := P(H_{a,b,m} > r)$, $p_2^* := P(H_{a,b,m} \ge 1)$. Let us show that

$$p_i \le e^{-b/8}, \qquad 1 \le i \le r. \tag{15}$$

For this purpose we apply the first inequality of (8). Denote $\lambda_i = EH_{b-i,d-a,m-a}$ and $t_i =
t - i - \lambda_i$. We have, for $1 \le i \le r$ and $d/m \le 1/100$,

$$\begin{aligned}
\lambda_i &= (b - i)(d - a)/(m - a) \le bd/m \le b/100, \\
t_i &\ge \lceil b/2 \rceil - \lfloor b/4 \rfloor - (b/100) \ge (b/4) - (b/100), \\
t_i &\le \lceil b/2 \rceil - i \le b/2.
\end{aligned}$$

These inequalities combined with the inequality, which follows from (8), $p_i \le e^{-t_i^2/(2(\lambda_i + t_i/3))}$
imply (15). Note that, for $a < b/4$, we have $p_1^* = 0$ and, therefore, (12) follows from (15) and
(14).

Now assume that $a \ge b/4$. Denote $\lambda_* = EH_{a,b,m}$ and $t_* = r + 1 - \lambda_*$. We have

$$\lambda_* = ab/m \le b/100, \qquad 1 + b/4 \ge t_* \ge b/4 - b/100.$$

Note that $b \ge 3$ implies $t_* \le (7/12) b$. These inequalities combined with the inequality, which
follows from (8), $p_1^* \le e^{-t_*^2/(2(\lambda_* + t_*/3))}$ imply

$$p_1^* \le e^{-b/8}. \tag{16}$$
Finally, we apply (9) to get the lower bound

$$p_2^* \ge 1 - e^{-ab/(2m)} \ge ab/(4m), \tag{17}$$
for $ab \le m$. Invoking (15), (16), (17) in (14) we obtain (12).

□

Lemma 4. Let $0 < \alpha < 1$ and $c_0, c_1, c_2 > 0$. Let $\{\omega_n\}$ be a positive sequence satisfying $\omega_n \to +\infty$
as $n \to \infty$. Let $\{(Z_{n1}, \dots, Z_{nn})\}$ be a sequence of random vectors with independent non-negative
coordinates satisfying condition (ii) of Theorem 1. We have as $n \to \infty$

$$P\big(n^{1/(1+\alpha)}/\omega_n \le \max_{1 \le i \le n} Z_{ni} \le n^{1/(1+\alpha)} \omega_n\big) \to 1. \tag{18}$$

Let $L_n(t) = \sum_{i=1}^{n} Z_{ni} I_{\{t \le Z_{ni} \le n^{1/(1+\alpha)} \omega_n\}}$. There exists an integer $n_1 > n_0$ depending on $\alpha, c_1, c_2$
and the sequence $\{\omega_n\}$ such that, for $n > n_1$ and $t \in (c_0, n^{1/(1+\alpha)})$, we have

$$c_1/2 \le \frac{\alpha}{1 + \alpha} \frac{t^{\alpha}}{n} EL_n(t) \le c_2. \tag{19}$$

For $1 < \tau < 1 + \alpha$ there exists an integer $n_2 > n_0$ and number $c_* > 0$ both depending on
$\alpha, \tau, c_1, c_2$ and the sequence $\{\omega_n\}$ such that, for $n > n_2$ and $t \in (c_0, n^{1/(1+\alpha)})$, we have

$$P\big(|L_n(t) - EL_n(t)| > \gamma EL_n(t)\big) \le c_* \gamma^{-\tau} n^{1-\tau} t^{(\tau-1)(\alpha+1)}. \tag{20}$$

Proof of Lemma 4. The proof is routine. We include it for the sake of completeness. Denote for
short $t_* = n^{1/(1+\alpha)}/\omega_n$ and $T_* = n^{1/(1+\alpha)} \omega_n$.

Let us prove (18). Write $p_n(t) := P(\max_{1 \le i \le n} Z_{ni} \le t)$. It follows from (1) as $n \to \infty$

$$p_n(t_*) = \prod_i P(Z_{ni} \le t_*) \le (1 - c_1/t_*^{1+\alpha})^n \le \exp\{-c_1 n/t_*^{1+\alpha}\} = o(1), \tag{21}$$

$$p_n(T_*) = \prod_i P(Z_{ni} \le T_*) \ge (1 - c_2/T_*^{1+\alpha})^n \ge \exp\{-c_2 n/(T_*^{1+\alpha} - c_2)\} = 1 - o(1). \tag{22}$$

In (21) we apply $1 - x \le e^{-x}$ to $x = c_1/t_*^{1+\alpha}$. In (22) we apply $1 - y \ge e^{-y/(1-y)}$ to $y =
c_2/T_*^{1+\alpha} < 1$.

Let us prove (19), (20). Given $1 \le \tau < 1 + \alpha$ and $n$, write $a_i^{(\tau)}(t) = EZ_{ni}^{\tau} I_{\{t \le Z_{ni} \le T_*\}}$. It follows
from (1) and the identity

$$a_i^{(\tau)}(t) = t^{\tau} P(t \le Z_{ni} \le T_*) + \tau \int_t^{T_*} x^{\tau-1} P(x < Z_{ni} \le T_*)\, dx$$

that, for sufficiently large $n$ and $t \in (c_0, n^{1/(1+\alpha)})$,

$$c_1/2 \le \frac{1 + \alpha - \tau}{1 + \alpha}\, a_i^{(\tau)}(t)\, t^{1+\alpha-\tau} \le c_2. \tag{23}$$

Note that the right hand side inequality holds for $n > n_0$, while the left hand side inequality
holds for $n > n'_0$, where $n'_0 = n'_0(\alpha, \tau, c_1, c_2, \{\omega_n\}) > n_0$.
An application of (23) to $EL_n(t) = \sum_{i=1}^{n} a_i^{(1)}(t)$ shows (19).
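Let us show (20). Denote $T_i = Z_{ni} I_{\{t \le Z_{ni} \le T_*\}} - EZ_{ni} I_{\{t \le Z_{ni} \le T_*\}}$. Write, for short, $b := \gamma EL_n(t)$.
By Chebyshev's inequality,

$$p_n(t) := P\Big(\Big|\sum_{i=1}^{n} T_i\Big| > b\Big) \le b^{-\tau} E\Big|\sum_{i=1}^{n} T_i\Big|^{\tau}. \tag{24}$$

Invoking the inequalities $E|\sum_i T_i|^{\tau} \le 2 \sum_i E|T_i|^{\tau}$ and $E|T_i|^{\tau} \le 2 a_i^{(\tau)}(t)$, $1 \le \tau \le 2$, we obtain

$$p_n(t) \le 4 b^{-\tau} \sum_{i=1}^{n} a_i^{(\tau)}(t). \tag{25}$$

It follows from (23) that $\sum_{i=1}^{n} a_i^{(\tau)}(t) \le c_2 \frac{1+\alpha}{1+\alpha-\tau}\, n\, t^{\tau-1-\alpha}$. Substitution of this inequality and of
(19) in (25) gives

$$p_n(t) \le \frac{8 c_2}{1 + \alpha - \tau} \frac{1}{c_1^{\tau} \gamma^{\tau}} \frac{t^{(\tau-1)(\alpha+1)}}{n^{\tau-1}},$$

thus proving (20). □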

Lemma 5. Assume that conditions of Theorem 1 are satisfied. Then (4) holds.
Let $V_n^0 = \{v_i : Z_{ni} \ge n^{1/(1+\alpha)}/l_2^{\alpha}(n)\} \subset V_n$. We have as $n \to \infty$

$$P(V_n^0 \subset C_1(G_n)) \to 1, \tag{26}$$
$$P\big(|V_n^0| > 2c_2 (l_2(n))^{\alpha(1+\alpha)}\big) \to 0. \tag{27}$$

Here $|V_n^0|$ denotes the number of elements of the set $V_n^0$ and $l_2(n)$ denotes $\ln(\ln(n))$.

Observe that (18) implies that every maximal vertex of $G_n$ belongs whp to $V_n^0$. Therefore, (26)
combined with (18) implies (5).

Proof of Lemma 5. Let us prove (27). Write $t_{0n} = n^{1/(1+\alpha)}/l_2^{\alpha}(n)$. We have

$$|V_n^0| = \sum_{i=1}^{n} I_i^0, \qquad I_i^0 := I_{\{Z_{ni} \ge t_{0n}\}}. \tag{28}$$

For $i = 1, \dots, n$, let $I_i^+$ be independent Bernoulli random variables with success probability
$P(I_i^+ = 1) = c_2 t_{0n}^{-1-\alpha}$. It follows from (1), (28) that the random variable $L^+ := \sum_{1 \le i \le n} I_i^+$ is
stochastically larger than $|V_n^0|$. Therefore, for every $a > 0$ we have

$$P(|V_n^0| > a) \le P(L^+ > a).$$

Recall that exponential inequalities (8) remain valid if we replace the hypergeometric random
variable $H$ by a Binomial random variable, see e.g., [11]. The first inequality of (8) applied to
Binomial probability $P(L^+ > a)$ with $a = 2EL^+ = 2c_2 (l_2(n))^{\alpha(1+\alpha)}$ shows (27).
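
The step above uses the first bound of (8) for a Binomial variable with $t = \lambda$, which gives $P(L^+ \ge 2\lambda) \le \exp\{-3\lambda/8\}$. The following sketch compares an exact Binomial tail with this bound for one set of parameters; the parameter values and function names are chosen here purely for illustration.

```python
from math import comb, exp

def binomial_upper_tail(n, p, a):
    """Exact P(L >= a) for L ~ Binomial(n, p) and an integer a."""
    return sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(a, n + 1))

def chernoff_upper(lam, t):
    """exp(-t^2 / (2 (lam + t/3))): the first bound in (8)."""
    return exp(-t * t / (2.0 * (lam + t / 3.0)))

lam = 200 * 0.05                      # mean of Binomial(200, 0.05)
tail = binomial_upper_tail(200, 0.05, 20)   # P(L >= 2 * lam)
bound = chernoff_upper(lam, lam)            # equals exp(-3 * lam / 8)
```

In the proof $\lambda = EL^+$ grows like a power of $l_2(n)$, so the bound tends to zero, which is all that (27) requires.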
Let us prove (4). Let $G_n^0$ be the subgraph of $G_n$ obtained by deleting the edges incident to
vertices from $V_n^0$. Note that $G_n^0$ is a random intersection graph defined by the random sets
$S_n^0(v_i)$, $v_i \in V_n$, such that $S_n^0(v_i) = S_n(v_i)$ for $Z_{ni} < t_{0n}$ and $S_n^0(v_i) = \emptyset$, for $Z_{ni} \ge t_{0n}$. Denote
$Z_{ni}^0 := Z_{ni} I_{\{Z_{ni} < t_{0n}\}} = |S_n^0(v_i)| \sqrt{n/m}$. Write

$$x^{1+\alpha} = 2c_2/c_1, \qquad a_0 := c_0, \qquad a_{i+1} = a_i x, \quad i = 0, 1, \dots,$$
and note that $x^{1+\alpha} \ge 2$. Let $Y_n$ be a discrete random variable with values $0, a_0, a_1, \dots, a_{j_n}$,
where $j_n + 1 = \max\{i : a_i < t_{0n}\}$, and with probabilities $P(Y_n = a_j) = p_j$, defined by

$$p_j := c_1 a_j^{-1-\alpha} - c_2 a_{j+1}^{-1-\alpha} = c_1 a_j^{-1-\alpha}/2, \qquad j = 0, 1, \dots, j_n.$$

Put $P(Y_n = 0) = 1 - p_0 - p_1 - \cdots - p_{j_n}$. Note that $Y_n$ is stochastically smaller than $Z_{ni}^0$, for
every $1 \le i \le n$. Indeed, (1) implies, for $j = 0, 1, \dots, j_n$,

$$\begin{aligned}
P(a_j < Z_{ni}^0 \le a_{j+1}) &= P(Z_{ni} > a_j) - P(Z_{ni} > a_{j+1}) \\
&\ge c_1 a_j^{-1-\alpha} - c_2 a_{j+1}^{-1-\alpha} \\
&= p_j = P(Y_n = a_j).
\end{aligned}$$

Let $Y_{n1}, \dots, Y_{nn}$ be independent copies of $Y_n$ defined on the same probability space as $Z_{ni}^0$, $1 \le
i \le n$, and such that almost surely $Y_{ni} \le Z_{ni}^0$, for every $1 \le i \le n$ (such coupling is possible
because $Y_{ni}$ is stochastically smaller than $Z_{ni}^0$). For $1 \le i \le n$, let $S_n^{00}(v_i)$ be a random subset
of $S_n^0(v_i)$ of size $|S_n^{00}(v_i)| = \lfloor Y_{ni} \sqrt{m/n} \rfloor$ (which is uniformly distributed over the class of subsets
of $S_n^0(v_i)$ of size $\lfloor Y_{ni} \sqrt{m/n} \rfloor$). Random subsets $S_n^{00}(v_i)$, $v_i \in V_n$, are independent and identically
distributed. They define a random intersection graph (denoted $G_n^{00}$) which is a subgraph of $G_n^0$. It
is easy to see that $EY_n^2 \to \infty$. Therefore, using Theorem 1 and Remark 2 of [2], one can show
that there exists $\rho \in (0, 1)$ such that the number of vertices $|C_1(G_n^{00})|$ of the largest connected
component of $G_n^{00}$ satisfies

$$P(|C_1(G_n^{00})| > \rho n) \to 1. \tag{29}$$
The inclusions $G_n^{00} \subset G_n^0 \subset G_n$ imply $|C_1(G_n^{00})| \le |C_1(G_n^0)| \le |C_1(G_n)|$ and, by (29), we obtain

$$P(|C_1(G_n)| > \rho n) \ge P(|C_1(G_n^0)| > \rho n) \ge P(|C_1(G_n^{00})| > \rho n) \to 1. \tag{30}$$

Note that (4) follows from (30).

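Let us prove (26). Denote $\delta = c_1/(12(1 + c_0)^{1+\alpha}) > 0$. Write $t_* = (1 + c_0)(2c_2/c_1)^{1/(1+\alpha)}$ and
note that for large $n$ we have $t_* < t_{0n}$. (1) implies, for $1 \le i \le n$,

$$P(1 + c_0 \le Z_{ni} < t_*) \ge \frac{c_1}{(1 + c_0)^{1+\alpha}} - \frac{c_2}{t_*^{1+\alpha}} = 6\delta. \tag{31}$$

We assume that $n$ is large so that $P(Z_{ni}^0 \ge 1) \ge 6\delta$. Denote

$$D = \cup_{v \in C_1(G_n^0)} S_n^0(v), \qquad d_* = \lfloor 2\delta\rho\sqrt{mn} \rfloor, \qquad k_* = \lfloor 2c_2 (l_2(n))^{\alpha(1+\alpha)} \rfloor.$$
Introduce the events

$$\mathcal{A} = \{\forall v \in V_n^0 : S_n(v) \cap D \neq \emptyset\}, \qquad \mathcal{D} = \{|D| \ge d_*\}, \qquad \mathcal{B} = \{|V_n^0| \le k_*\}.$$
Note that (26) follows from the limit $P(\mathcal{A}) \to 1$, which itself follows from (27) and the limits

$$P(\mathcal{D}) \to 1, \qquad P(\mathcal{A} \cap \mathcal{D} \cap \mathcal{B}) \to 1. \tag{32}$$

Therefore, in order to prove (26) it suffices to show (32).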
Let us show the first limit of (32). Denote

$$A = \sum_{v \in C_1(G_n^0)} |S_n(v)|, \qquad B = \sum_{\{u, v\} \subset C_1(G_n^0)} |S_n(v) \cap S_n(u)|.$$
The obvious inequality $|D| \ge A - B$ combined with the bounds

$$P(B > \delta\rho\sqrt{mn}) = o(1), \qquad P(A < 4\delta\rho\sqrt{mn}) = o(1) \tag{33}$$

implies $P(\bar{\mathcal{D}}) \to 0$, where $\bar{\mathcal{D}}$ denotes the event complement to $\mathcal{D}$. It remains to prove (33). It follows from (1) that there exists a number
$C > 0$ (depending only on $\alpha, c_0, c_1, c_2$) such that $EZ_{ni} \le C$ uniformly in $n > n_0$ and $1 \le i \le n$.
We have

$$EB \le \sum_{1 \le i < j \le n} E|S_n(v_i) \cap S_n(v_j)| = \sum_{1 \le i < j \le n} \frac{EZ_{ni}\, EZ_{nj}}{n} \le \frac{C^2 n}{2}.$$

The bound $EB = O(n)$ in combination with condition (i) of Theorem 1 implies the first bound
of (33). Let us prove the second bound of (33). Write $V_n^* = V_n \setminus V_n^0$. We call a vertex $v \in V_n^*$
large if $Z_n(v) \ge 1$. Other vertices of $V_n^*$ are called small. Let $N_*$ denote the number of large
vertices in $V_n^*$. Note that large vertices have higher probabilities of belonging to $C_1(G_n^0)$ than
small ones. Therefore, the number $N$ of large vertices in $C_1(G_n^0)$ is stochastically larger than
the number $N_0$ of large vertices in the simple random sample of size $|C_1(G_n^0)|$ drawn without
replacement and with equal probabilities from the set $V_n^*$. The obvious inequality $A \ge N\sqrt{m/n}$
implies, for $s > 0$,

$$P(A \ge s) \ge P(N \ge s\sqrt{n/m}) \ge P(N_0 \ge s\sqrt{n/m}). \tag{34}$$
We shall show that, for $s_n = 4\delta\rho n$,

$$P(N_0 \ge s_n) \to 1. \tag{35}$$
Introduce the events $\mathcal{H} = \{|C_1(G_n^0)| \ge \rho n\}$ and $\mathcal{B}_* = \{N_* \ge 5\delta n\}$ and denote

$$p(n) = P(\{N_0 \ge s_n\} \cap \mathcal{B} \cap \mathcal{B}_* \cap \mathcal{H}). \tag{36}$$
By the total probability formula,

$$p(n) = \sum_{h \ge \rho n} \sum_{b \ge 5\delta n} \sum_{k \le k_*} p_{h,b,k}(n)\, P\big(|C_1(G_n^0)| = h,\ N_* = b,\ |V_n^*| = n - k\big). \tag{37}$$

Here $p_{h,b,k}(n)$ denotes the conditional probability of the event $\{N_0 \ge s_n\}$ given $|C_1(G_n^0)| =
h$, $N_* = b$, $|V_n^*| = n - k$. (8) applies to the hypergeometric probability $p_{h,b,k}(n) = P(H_{h,b,n-k} \ge
s_n)$ and, for large $n$, shows $p_{h,b,k}(n) \ge 1 - n^{-10}$. From (37) we obtain

$$p(n) \ge P(\mathcal{B} \cap \mathcal{B}_* \cap \mathcal{H})(1 - n^{-10}). \tag{38}$$

Note that the law of large numbers combined with (27) shows $P(\mathcal{B}_*) \to 1$. This limit together
with (30) and (27) implies $P(\mathcal{B} \cap \mathcal{B}_* \cap \mathcal{H}) \to 1$. The latter limit, (36) and (38) show (35).
Finally, (35) combined with (34) implies the second bound of (33), thus completing the proof of
the limit $P(\mathcal{D}) \to 1$.

Let us show the second limit of (32). The total probability formula gives

$$P(\mathcal{A} \cap \mathcal{B} \cap \mathcal{D}) = \sum_{k \le k_*} \sum_{d \ge d_*} P_{k,d}(\mathcal{A})\, P(|D| = d,\ |V_n^0| = k). \tag{39}$$

Here $P_{k,d}$ denotes the conditional probability given $|D| = d$, $|V_n^0| = k$. Let $S_* = \{|S_n(v)|, v \in
V_n^0\}$ denote the collection of sizes of sets $S_n(v)$ of vertices $v \in V_n^0$. Note that for $|V_n^0| = k$, the
collection $S_* = \{s_1, \dots, s_k\}$ is a multiset. We have

$$P_{k,d}(\mathcal{A}) = \sum_{\{s_1, \dots, s_k\}} P_{k,d}(\mathcal{A} \mid S_* = \{s_1, \dots, s_k\})\, P_{k,d}(S_* = \{s_1, \dots, s_k\}). \tag{40}$$

Here the sum is taken over all possible values $\{s_1, \dots, s_k\}$ of the multiset $S_*$ of cardinality $k$.
The identity

$$P_{k,d}(\mathcal{A} \mid S_* = \{s_1, \dots, s_k\}) = \prod_{i=1}^{k} P(H_{s_i,d,m} \ge 1)$$

combined with (9) implies, for large $n$, the inequality $P_{k,d}(\mathcal{A} \mid S_* = \{s_1, \dots, s_k\}) \ge 1 - n^{-10}$
uniformly in $d$, $k$ and $s_1, \dots, s_k$ satisfying the inequalities $s_i \ge t_{0n}\sqrt{m/n}$, $d \ge d_*$, $k \le k_*$. Now
(40) implies the inequality $P_{k,d}(\mathcal{A}) \ge 1 - n^{-10}$, for $d \ge d_*$ and $k \le k_*$. Invoking the latter
inequality in (39) we obtain

$$P(\mathcal{A} \cap \mathcal{B} \cap \mathcal{D}) \ge (1 - n^{-10}) P(\mathcal{B} \cap \mathcal{D}) = 1 - o(1).$$

In the last step we used (27) and the first bound of (32). The proof of (32) is complete.

□



Proof of Theorem 1. In the proof we use the approach developed in [13], [16], [13].
Before the proof we introduce some notation. Denote

$$t_0 = n^{1/(1+\alpha)}\, l_2^{-\alpha}(n), \qquad t_k = n^{\alpha^k/(1+\alpha)}\, l_2(n), \quad k = 1, 2, \dots, \tag{41}$$

$$k_* = \max\{k : n^{\alpha^k/(1+\alpha)} \ge 100 + c_0\}.$$

We use the following simple properties of the sequence $\{t_k\}$. For $n > 9$ we have

$$t_0 t_1/n = l_2^{1-\alpha}(n), \qquad t_k t_{k-1}^{-\alpha} = l_2^{1-\alpha}(n), \quad k = 2, 3, \dots, \tag{42}$$
$$100\, l_2(n) \le t_{k_*} \le (100 + c_0)^{1/\alpha}\, l_2(n), \qquad k_* \le l_2(n)/\ln(1/\alpha). \tag{43}$$
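
The sequence of thresholds can be tabulated numerically. The sketch below assumes the reading $t_k = n^{\alpha^k/(1+\alpha)} l_2(n)$ for $k \ge 1$ and $k_* = \max\{k : n^{\alpha^k/(1+\alpha)} \ge 100 + c_0\}$; it works with $\ln n$ as input to avoid overflow for very large $n$. The function name `sequence_tk` and the chosen parameter values are hypothetical.

```python
import math

def sequence_tk(log_n, alpha, c0):
    """Return (k_star, [t_1, ..., t_{k_star}]) for a given value of ln n.

    Assumes t_k = n^(alpha^k / (1 + alpha)) * l_2(n) with l_2(n) = ln ln n.
    Returns (0, []) when no k >= 1 satisfies the defining inequality.
    """
    l2 = math.log(log_n)
    threshold = math.log(100.0 + c0)
    ts, k = [], 1
    while alpha**k * log_n / (1.0 + alpha) >= threshold:
        ts.append(math.exp(alpha**k * log_n / (1.0 + alpha)) * l2)
        k += 1
    return k - 1, ts

log_n = 50 * math.log(10)                 # n = 10^50
k_star, ts = sequence_tk(log_n, alpha=0.5, c0=1.0)
l2 = math.log(log_n)
```

The number of levels $k_*$ grows only like $\ln\ln n$, which is the source of the loglog bound: the path constructed in the proof climbs one level per step.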

Given $\mathcal{U} \subset V_n$ we denote $S(\mathcal{U}) = \cup_{v \in \mathcal{U}} S_n(v)$. Throughout the proof limits are taken as $n \to \infty$.
Given $n$, write $m = m_n$ and $T = T_n = n^{1/(1+\alpha)} \omega_n$. Fix $1 < \tau < 1 + \alpha$. By $c_*, c_1^*, \dots$ we denote
positive constants that may depend only on $\alpha, \tau, c_0, c_1, c_2$.
Let us prove (2). Fix a maximal vertex $u_n$ of $G_n$. We have

$$P\big(d(v_1, u_n) > k_* + \varepsilon l_2(n) \,\big|\, d(v_1, u_n) < \infty\big) = \frac{P\big(d(v_1, u_n) > k_* + \varepsilon l_2(n),\ d(v_1, u_n) < \infty\big)}{P(d(v_1, u_n) < \infty)}.$$

In order to prove (2) we shall show that

$$P\big(d(v_1, u_n) > k_* + \varepsilon l_2(n),\ d(v_1, u_n) < \infty\big) = o(1), \tag{44}$$

$$\liminf_n P(d(v_1, u_n) < \infty) > 0. \tag{45}$$

Let us prove (45). Write $C_1 = C_1(G_n)$. It follows from (5) that $P(u_n \in C_1) = 1 - o(1)$.
Therefore, we have

$$P(d(v_1, u_n) < \infty) \ge P(v_1, u_n \in C_1) = P(v_1 \in C_1) - o(1). \tag{46}$$
Inequalities (30) imply $E|C_1| \ge \rho n (1 - o(1))$ and, by symmetry, we obtain

$$P(v_1 \in C_1) = n^{-1} \sum_{v \in V_n} P(v \in C_1) = n^{-1} E|C_1| \ge \rho(1 - o(1)).$$

This inequality combined with (46) implies (45).
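Let us prove (44). Introduce the sets

$$\mathcal{U}_0 = \{u_n\}, \qquad \mathcal{U}_k = \{v_j : t_k \le Z_{nj} \le T\}, \quad k = 1, 2, \dots, k_*,$$
$$\mathcal{U}^* = \{v_j : 1 \le Z_{nj} < t_{k_*}\}.$$

Denote $Q_k = \sum_{v \in \mathcal{U}_k} |S_n(v)|$ and $q_k = EQ_k$. Introduce the events

$$\mathcal{A}_0 = \{t_0 \le Z_n(u_n) \le T\},$$

$$\mathcal{A}_k = \{q_k/2 \le Q_k \le (3/2) q_k\}, \quad k = 1, 2, \dots, k_*,$$
$$\mathcal{A}_{*1} = \{|\mathcal{U}^*| \ge 5n\delta\}, \qquad \mathcal{A}_{*2} = \{|\mathcal{U}_{k_*}| \le n/l_2(n)\}.$$

Here $\delta > 0$ is defined in (31) above. Denote $\mathcal{A} = \cap_{k=0}^{k_*} \mathcal{A}_k \cap \mathcal{A}_{*1} \cap \mathcal{A}_{*2}$. Let us show that

$$P(\mathcal{A}) \to 1. \tag{47}$$

(47) follows from the limits

$$P(\mathcal{A}_{*i}) \to 1, \quad i = 1, 2, \qquad P(\mathcal{A}_0) \to 1, \qquad P(\cap_{1 \le k \le k_*} \mathcal{A}_k) \to 1. \tag{48}$$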

An application of Chebyshev's inequality to the binomial random variables $|\mathcal{U}_{k_*}|$ and $|\mathcal{U}^*|$ gives
the first limit of (48). The second limit of (48) is shown in (18). To show the third limit of (48)
we write

$$1 - P(\cap_{1 \le k \le k_*} \mathcal{A}_k) = P(\cup_{1 \le k \le k_*} \bar{\mathcal{A}}_k) \le \sum_{1 \le k \le k_*} P(\bar{\mathcal{A}}_k).$$

Here $\bar{\mathcal{A}}_k$ denotes the event complement to $\mathcal{A}_k$. Combining the bound, which follows from (20),

$$P(\bar{\mathcal{A}}_k) \le c_*\, n^{-(1-\alpha)(\tau-1)} (l_2(n))^{(\alpha+1)(\tau-1)}$$

and the bound, see (43), $k_* = O(l_2(n))$ we obtain $\sum_{1 \le k \le k_*} P(\bar{\mathcal{A}}_k) = o(1)$, thus showing the
third limit of (48). We arrive at (47).

In the remaining part of the proof we shall assume that the event $\mathcal{A}$ holds. Let $P$, $E$ and
$G$ denote the conditional probability, the conditional expectation, and the conditional random
graph $G_n$ given $Z_{n1}, \dots, Z_{nn}$. Write $V^* = V_n \setminus \mathcal{U}_{k_*}$ and let $G^*$ denote the subgraph of $G$ induced
by $V^*$. Given $v \in V^*$ define $d_*(v) = \min\{d(w, v) : w \in \mathcal{U}_{k_*}\}$. We shall show that uniformly in
$Z_{n1}, \dots, Z_{nn}$ satisfying $\mathcal{A}$ and uniformly in $v \in V^*$, $u \in \mathcal{U}_{k_*}$

$$P(d_*(v) > \varepsilon l_2(n),\ d_*(v) < \infty) = o(1), \tag{49}$$
$$P(d(u, u_n) > k_*) = o(1). \tag{50}$$

It follows from (49) that a vertex $v \in V^*$ satisfying $d(v, u_n) < \infty$ finds whp a path of length at
most $\varepsilon l_2(n)$ to a vertex $u \in \mathcal{U}_{k_*}$. (50) then applies to $u$ and together with (49) implies

$$P(d(v_1, u_n) > k_* + \varepsilon l_2(n),\ d(v_1, u_n) < \infty) = o(1). \tag{51}$$

The bound (51) combined with (47) shows (44). It remains to prove (49), (50).
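Proof of (49). For simplicity of notation we put $\varepsilon = 1$. Given $v \in V^*$ denote $L_* = \{v' \in V^* :
d^*(v, v') \le l_2(n)\}$. Here $d^*$ denotes the distance between vertices of the graph $G^*$. Introduce the
event $\mathcal{B}_* = \{|S(L_*)| \ge \delta\, l_2(n)\sqrt{m/n}\}$. The event

$$\{d_*(v) > l_2(n),\ d_*(v) < \infty\} \subset \{S(L_*) \cap S(\mathcal{U}_{k_*}) = \emptyset,\ |L_*| > l_2(n)\}$$

$$\subset \big(\{S(L_*) \cap S(\mathcal{U}_{k_*}) = \emptyset\} \cap \mathcal{B}_*\big) \cup \big(\bar{\mathcal{B}}_* \cap \{|L_*| > l_2(n)\}\big).$$
Here $\bar{\mathcal{B}}_*$ denotes the event complement to $\mathcal{B}_*$. We have

$$P(d_*(v) > l_2(n),\ d_*(v) < \infty) \le p' + p'', \tag{52}$$

where

$$p' = P(\{S(L_*) \cap S(\mathcal{U}_{k_*}) = \emptyset\} \cap \mathcal{B}_*), \qquad p'' = P(\bar{\mathcal{B}}_* \cap \{|L_*| > l_2(n)\}).$$
We shall show that

$$p' = o(1), \qquad p'' = o(1) \quad \text{as } n \to \infty. \tag{53}$$

Let us prove the first bound. (19) and (43) imply $q_{k_*} \ge 4c_2^*\sqrt{mn}/(l_2(n))^{\alpha}$. Invoking the
inequality $Q_{k_*} \ge q_{k_*}/2$ and the inequality, which follows from Lemma 2, $P(|S(\mathcal{U}_{k_*})| \ge Q_{k_*}/2) =
1 - o(1)$ we obtain the bound $1 - P(\mathcal{B}') = o(1)$ for the event $\mathcal{B}' = \{|S(\mathcal{U}_{k_*})| \ge c_2^*\sqrt{mn}/(l_2(n))^{\alpha}\}$.
Therefore, we have

$$\begin{aligned}
p' &= P(\{S(L_*) \cap S(\mathcal{U}_{k_*}) = \emptyset\} \cap \mathcal{B}' \cap \mathcal{B}_*) + o(1) \\
&\le P(S(L_*) \cap S(\mathcal{U}_{k_*}) = \emptyset \mid \mathcal{B}', \mathcal{B}_*) + o(1) \\
&= o(1).
\end{aligned}$$

In the last step we applied (9) to the random variable $H = |S(L_*) \cap S(\mathcal{U}_{k_*})|$ conditionally, given
$|S(L_*)|$ and $|S(\mathcal{U}_{k_*})|$.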

Let us show the second bound of (53). Denote $k' = \lfloor l_2(n) \rfloor$. Let $\{v'_1, v'_2, \dots, v'_{n'}\}$ be an
enumeration of elements of $V^*$. We call $v'_i$ smaller than $v'_j$ whenever $i < j$. We call $v' \in V^*$
large if $Z_n(v') \ge 1$. Paint elements of $V^*$ white. Given $v \in V^*$ we construct the 'breadth
first search' tree $T_v$ in $G^*$ with the root $v$ as follows. Paint vertex $v$ black and write $\tau_0 = v$.
White vertices are checked in increasing order and those found adjacent to $\tau_0$ are painted black.
Denote them $\tau_1 < \tau_2 < \cdots < \tau_{j_1}$. After all neighbours of $\tau_0$ have been found the vertex $\tau_0$
is called saturated. Then proceed recursively: take the first available black unsaturated vertex,
say $\tau_i$ (here $i = \min\{j : \tau_j \text{ is black and unsaturated}\}$), and find its neighbours among
remaining white vertices. Do this by checking white vertices in increasing order. After all
white neighbours of $\tau_i$ have been found the vertex $\tau_i$ is called saturated, the neighbours are
denoted $\tau_{j_{i-1}+1} < \tau_{j_{i-1}+2} < \cdots < \tau_{j_i}$ and painted black. We call $\tau_i$ the parent vertex of
its children $\tau_{j_{i-1}+1}, \dots, \tau_{j_i}$. In this way we obtain the list $L = \{\tau_0, \tau_1, \dots\}$ of vertices of the
tree $T_v$. Denote $L_r = \{\tau_0, \dots, \tau_r\}$. Let $N$ denote the number of large vertices in the set
$L_{k'}$. We say that (player) $v$ receives a yellow card at step $r \ge 1$ if vertex $\tau_r$ is large and
$|S(L_{r-1}) \cap S_n(\tau_r)| \ge 2^{-1} |S_n(\tau_r)|$. The event that $v$ receives the first yellow card at step $r$ is
denoted $\mathcal{B}_r$. On the event $\mathcal{M} := (\cap_{r=1}^{k'} \bar{\mathcal{B}}_r) \cap \{N \ge 4\delta k'\}$ we have

$$|S(L_{k'})| \ge 2^{-1} N \sqrt{m/n} \ge \delta\, l_2(n) \sqrt{m/n}. \tag{54}$$
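
The exploration just described is an ordinary breadth first search in which white vertices are scanned in a fixed order. A minimal sketch, assuming the graph is given as an adjacency dictionary and the enumeration as a list `order`; the function name `bfs_tree` and the returned data layout are chosen here for illustration.

```python
def bfs_tree(adj, root, order):
    """Explore the component of `root` as in the construction of T_v.

    White vertices are scanned in the order given by `order`.
    Returns (L, parent): the list tau_0, tau_1, ... in discovery order
    and the map sending each vertex to its parent in the tree.
    """
    rank = {v: i for i, v in enumerate(order)}
    black = {root}
    L = [root]
    parent = {root: None}
    i = 0
    while i < len(L):                      # L[i] is the first black unsaturated vertex
        tau = L[i]
        children = sorted((w for w in adj[tau] if w not in black),
                          key=rank.__getitem__)
        for w in children:                 # paint white neighbours black
            black.add(w)
            parent[w] = tau
            L.append(w)
        i += 1                             # tau is now saturated
    return L, parent

adj = {0: {1, 2}, 1: {0, 3}, 2: {0, 3}, 3: {1, 2}, 4: set()}
L, parent = bfs_tree(adj, root=0, order=[0, 1, 2, 3, 4])
```

In the example vertex 3 is adjacent to both 1 and 2, and it is assigned to the parent 1 because 1 is saturated first; this is the same tie-breaking as in the text.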

Note that the inequality $|L_*| > l_2(n)$ implies $|L| \ge k' + 1$. Therefore, we have

$$p'' \le P(\bar{\mathcal{B}}_* \cap \{|L| \ge k' + 1\}).$$

Furthermore, for $|L| \ge k' + 1$, the inclusion $L_{k'} \subset L_*$ implies $|S(L_{k'})| \le |S(L_*)|$ and in view of
(54) we conclude that events $\mathcal{M}$ and $\bar{\mathcal{B}}_* \cap \{|L| \ge k' + 1\}$ do not intersect. We have

$$P(\bar{\mathcal{B}}_* \cap \{|L| \ge k' + 1\}) = P(\bar{\mathcal{B}}_* \cap \{|L| \ge k' + 1\} \cap \bar{\mathcal{M}}) \le p_1^* + p_2^*,$$

where

$$p_1^* := P(\{|L| \ge k' + 1\} \cap \{N < 4\delta k'\}), \qquad p_2^* := P(\cup_{r=1}^{k'} \mathcal{B}_r).$$
In order to prove the bound $p'' = o(1)$ we shall show that $p_i^* = o(1)$, $i = 1, 2$.
Write

$$p_1^* = P(|L| \ge k' + 1)\, \tilde{p}, \qquad \tilde{p} := P(N < 4\delta k' \mid |L| \ge k' + 1). \tag{55}$$

Since large vertices have higher probabilities to join the list $L$ than the other vertices we conclude
that the random variable $N$ is stochastically larger than the number $N_0$ of large vertices in the
simple random sample of size $k' + 1$ drawn without replacement and with equal probabilities
from the set $V^*$. In particular, we have

$$\tilde{p} \le P(N_0 < 4\delta k'). \tag{56}$$

Note that on the event $\mathcal{A}_{*1} \cap \mathcal{A}_{*2}$ we have $EN_0 \ge 5\delta(k' + 1)$ and $\mathrm{Var}\, N_0 = O(k')$. Therefore,
Chebyshev's inequality implies $P(N_0 < 4\delta k') = O(1/k') = o(1)$. This bound combined with
(55) and (56) implies the bound $p_1^* = o(1)$.

In order to prove the bound $p_2^* = o(1)$ we write $p_2^* \le \sum_{r=1}^{k'} P(\mathcal{B}_r)$ and show that

$$P(\mathcal{B}_r) \le n^{-10}, \tag{57}$$

for every $r$ and large $n$. Before the proof of (57) we introduce some notation. For $i \ge 1$ denote
$W_i = W \setminus S(L_{i-1})$, $S'(\tau_i) = S_n(\tau_i) \setminus S(L_{i-1})$, $m_i = |W_i|$, $s_i = |S_n(\tau_i)|$, $s'_i = |S'(\tau_i)|$. Put
$W_0 = W$. Fix $r \ge 1$. Let $\tau_{r*}$ denote the parent vertex of $\tau_r$. Denote $D_r = S(L_{r-1}) \cap W_{r*}$ and
$d_r = |D_r|$. We have $P(\mathcal{B}_r \mid W_{r*}, D_r, S'(\tau_{r*}), s_r) \le p_*$, where

$$p_* := P_{r*}\big(|D_r \cap S_n(\tau_r)| \ge 2^{-1} s_r \,\big|\, S_n(\tau_r) \cap S'(\tau_{r*}) \neq \emptyset\big). \tag{58}$$

Here $P_{r*}$ denotes the conditional probability $P$ given $W_{r*}$, $D_r$, $S'(\tau_{r*})$, $s_r$. Note that in (58)
values of all random variables are fixed (given), except $S_n(\tau_r)$, which is a random set uniformly
distributed in the class of subsets of $W_{r*}$ of given size $s_r$ satisfying $s_r \ge \sqrt{m/n}$ (because $\tau_r$ is a
large vertex). It follows from (12) that for large $n$ we have

$$p_* \le e^{-s_r/8}(1 + 16 m_{r*}/s_r^2) \le e^{-8^{-1}\sqrt{m/n}}(1 + 16n) \le n^{-10}. \tag{59}$$

In the last step we applied condition (i) of Theorem 1. (59) implies (57) thus completing the
proof of (53). We arrive at (49).

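Proof of (50). A given vertex $u'_0 \in \mathcal{U}_{k_*}$ finds a neighbour in $\mathcal{U}_{k_*-1}$, say, $u'_1$ with probability at least

$$\min_{u \in \mathcal{U}_{k_*}} P(S_n(u) \cap S(\mathcal{U}_{k_*-1}) \neq \emptyset) =: p_0.$$

Similarly, a given vertex $u'_j \in \mathcal{U}_{k_*-j}$ finds a neighbour in $\mathcal{U}_{k_*-j-1}$, say $u'_{j+1}$, with probability at least

$$\min_{u \in \mathcal{U}_{k_*-j}} P(S_n(u) \cap S(\mathcal{U}_{k_*-j-1}) \neq \emptyset) =: p_j,$$

and so on. In this way we may construct a path (namely, $u'_0, u'_1, u'_2, \dots, u'_{k_*} = u_n$) of length
at most $k_*$ connecting $u_n$ with an arbitrary vertex $u'_0$ from $\mathcal{U}_{k_*}$. The probability that such a
construction fails is at most $\sum_{j=0}^{k_*-1} (1 - p_j)$. In particular, for any given $u \in \mathcal{U}_{k_*}$ we have

$$P(d(u, u_n) > k_*) \le \sum_{j=0}^{k_*-1} (1 - p_j).$$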
In order to prove (50) we shall show that, for some $c_1^* > 0$ and large $n$,

$$1 - p_j \le e^{-c_1^* (l_2(n))^{1-\alpha}} + n^{-2}, \qquad 0 \le j \le k_* - 1. \tag{60}$$

Fix $1 \le i \le k_* - 1$ and $u \in \mathcal{U}_{i+1}$. On the event $\mathcal{A}$ we have, for large $n$,

$$Q_i \ge \frac{q_i}{2} \ge \frac{c_1}{4} \frac{1 + \alpha}{\alpha} \frac{\sqrt{mn}}{t_i^{\alpha}}, \tag{61}$$

where the second inequality of (61) follows from (19). Denote $c_1^* = \frac{c_1}{16} \frac{1+\alpha}{\alpha}$ and introduce the
event $\mathcal{B} = \{|S(\mathcal{U}_i)| \ge 2c_1^* \sqrt{mn}\, t_i^{-\alpha}\}$. It follows from Lemma 2 (applied to $\gamma_1 = 1/10$ and
$\gamma_2 = 1/2$) that

$$1 - n^{-2} \le P(|S(\mathcal{U}_i)| \ge Q_i/2) \le P(\mathcal{B}). \tag{62}$$

Here in the last step we invoke (61). Next we apply (9) to the hypergeometric random variable
$H = |S_n(u) \cap S(\mathcal{U}_i)|$, where $|S_n(u)|$ and $|S(\mathcal{U}_i)|$ are given and satisfy $|S_n(u)| \ge t_{i+1}\sqrt{m/n}$ and
$|S(\mathcal{U}_i)| \ge 2c_1^* \sqrt{mn}\, t_i^{-\alpha}$. We obtain

$$P(S_n(u) \cap S(\mathcal{U}_i) = \emptyset \mid \mathcal{B}) \le \exp\{-c_1^* t_{i+1} t_i^{-\alpha}\} = \exp\{-c_1^* (l_2(n))^{1-\alpha}\}. \tag{63}$$

Combining (62) and (63) we obtain (60), for $j = k_* - i - 1$ satisfying $0 \le j \le k_* - 2$. The proof
of (60) for $j = k_* - 1$ is similar but simpler. We arrive at (50), thus completing the proof of (2).

Let us prove (3). Denote $a_n = (1 + \varepsilon/2) \ln^{-1}(1/\alpha) \ln(\ln(2 + n))$ and introduce the events

$$\mathcal{D} = \{v_1, v_2 \in C_1\}, \qquad \mathcal{G} = \{d(v_1, v_2) > 2a_n\}, \qquad \mathcal{G}_i = \{d(v_i, u_n) > a_n\}, \quad i = 1, 2.$$

Note that (3) is equivalent to the limit $P(\mathcal{G} \mid \mathcal{D}) = o(1)$. In order to prove (3) we shall show
that there exists $\rho > 0$ such that

$$\liminf_n P(\mathcal{D}) \ge \rho^2, \tag{64}$$
$$P(\mathcal{G} \cap \mathcal{D}) = o(1). \tag{65}$$

Let us prove (64). It follows from the identity $|C_1| = \sum_{v \in V} I_{\{v \in C_1\}}$, by the symmetry, that

$$E|C_1|^2 = E\sum_{v \in V} I^2_{\{v \in C_1\}} + E\sum_{u, v \in V,\ u \neq v} I_{\{u, v \in C_1\}} = E|C_1| + n(n - 1)P(\mathcal{D}).$$

This identity combined with $|C_1| \le n$ and the inequality, which follows from (4), $E|C_1|^2 \ge
n^2 \rho^2 (1 - o(1))$ shows (64).

Let us prove (65). In view of (5) it suffices to show that $p := P(\mathcal{G} \cap \mathcal{D} \cap \{u_n \in C_1\}) = o(1)$. We
have $p \le p_1 + p_2$, where $p_i = P(d(v_i, u_n) > a_n,\ d(v_i, u_n) < \infty)$, $i = 1, 2$. Finally, (44) implies
$p_i = o(1)$, thus completing the proof of (65).

□